INTRODUCTION


This report examines current and emerging trends in how data is used in 
political campaigns. It is divided into three sections. The first looks at the 
state of the art in data and technology in relation to online advertising. 
The second sets out how these approaches are currently being applied 
in political campaigning, and projects how it might evolve in the next 
two to five years. The third section sets out what we consider to be the 
main risks associated with these trends. 


It is important to note that we draw on publicly available sources - 
notably academic articles, consulting reports and analytics companies 
themselves. Therefore, we are not able to review proprietary new 
technology that has yet to be made public. 


This paper has been directly informed by 37 academic papers, 37 
industry papers, 77 pieces of related material and company websites 
and five patents.


STATE OF THE ART


Data 


Key trends 


• The next few years will see a proliferation of data on consumer
demographics, behaviour and attitudes - including health and
location data collected through smartphones and the growth of
internet-enabled devices.

• Companies see the collection, combination and analysis of
diverse datasets as an important source of value. Key to this value
is the ability to connect and analyse previously discrete datasets.

• While this analysis of connected datasets will offer new insights to
companies, it also poses serious challenges to the privacy of
individuals, and is likely to allow surprising and intimate inferences
to be made.


Increase in volume and scale 


The amount of data created worldwide is growing exponentially. IBM 
estimates that 90 per cent of the data that exists today was created in 
the last two years, with around 2.5 quintillion bytes of data produced 
each day from almost every sector of the economy.1 The International
Data Corporation predicts that, by 2025, the world will create and
replicate 163 zettabytes - or 163 trillion gigabytes - of data every single
year.2
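

As a rough check on the scale of these figures, the unit conversions can be worked through in a few lines. This is an illustrative sketch only: it assumes decimal (SI) units, and the constant and variable names are our own rather than anything used by IBM or the International Data Corporation.

```python
# Decimal (SI) units are assumed throughout.
GIGABYTE = 10**9       # bytes
ZETTABYTE = 10**21     # bytes
QUINTILLION = 10**18

# IDC projection for 2025: 163 zettabytes per year.
yearly_bytes_2025 = 163 * ZETTABYTE
yearly_gigabytes_2025 = yearly_bytes_2025 // GIGABYTE

# IBM estimate: 2.5 quintillion bytes per day (kept as an exact integer).
daily_bytes_today = 25 * QUINTILLION // 10
yearly_zettabytes_today = daily_bytes_today * 365 / ZETTABYTE

# One trillion is 10**12, so this confirms "163 trillion gigabytes".
print(yearly_gigabytes_2025 == 163 * 10**12)
print(yearly_zettabytes_today)
```

On these assumptions, the IBM daily figure implies a little under one zettabyte per year today, against the 163 zettabytes projected for 2025.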


Much of this growth is due to the use of ‘big data’, a term referring to the 
storage and analysis of large, complex data sets.3 The growing 
significance and value of big data has been driven by a number of 
recent technological developments, in particular the affordability of 
computational power and data storage capacity.4 Platforms have been
developed to store data from various sources, including sensors, apps, 
geographical location, images, videos and user behaviour, and allow 
this information to be queried together. The real value of this approach is 
the insight and patterns that can be derived from the relationships 
between data points.5


The raw materials for big data platforms are likely to be increasingly
supplied by sensors and devices, often referred to as the ‘Internet of
Things’ (IoT). IoT technology can be loosely defined as a network of
‘smart’, internet-enabled sensors and devices, able to communicate
with each other and collect data about their use. In 2017, there were 8.4
billion connected IoT devices worldwide, projected to increase to 30
billion by 2020.6 This technology is being applied across a broad range of
sectors, from healthcare and household devices to smart cities.


The majority of emerging IoT products are targeted at the consumer
market. A 2017 report by TechUK predicted that, by the end of that year,
two thirds of IoT applications would be consumer-facing, and found that
80 per cent of respondents owned at least one connected device.7 Over
the next few years, it is likely that we will see this technology proliferate in
homes, in the urban environment, and carried around by the public.


Surveys show that consumers are keen to use IoT devices in the home.
TechUK found that 42 per cent of respondents were interested in smart
energy, and 39 per cent in home monitoring and control.8 By 2021 it is
estimated that the connected home will be made up of an average of 8.7
devices, and by 2020 every person will create 1.7MB of data every
second.9 A recent IBM blog illustrates the potential benefits of a ‘smart
home’, especially in respect of assisted living for the elderly: the home can
build up a model of the resident’s regular habits and schedule, so that
anything unusual can be flagged and a warning sent to care
providers.10
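

One simple way such a habit model could flag something unusual is a z-score test against past observations. The following is a minimal sketch under our own assumptions: the data (the hour at which a resident is first active in the kitchen each morning), the function name and the threshold are all hypothetical, and nothing here is taken from the IBM blog.

```python
from statistics import mean, stdev

def is_unusual(history, observed, threshold=3.0):
    """Return True if `observed` lies more than `threshold` standard
    deviations from the mean of `history` (a simple z-score test)."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # No variation in the history: anything different is unusual.
        return observed != mu
    return abs(observed - mu) / sigma > threshold

# Hypothetical data: hour of first kitchen activity over ten days.
usual_hours = [7.0, 7.25, 6.75, 7.5, 7.0, 7.25, 6.5, 7.0, 7.5, 7.25]

print(is_unusual(usual_hours, 7.1))   # a typical morning
print(is_unusual(usual_hours, 11.0))  # a much later start than usual
```

A real system would need far richer models, but even this sketch shows how routine sensor readings become a behavioural profile of the resident.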


A 2017 YouGov survey found that consumers are also increasingly
adopting ‘wearable’ devices, such as fitness trackers. In 2017, the
percentage of UK citizens who owned a wearable device was 17 per cent,
up from just 2 per cent in 2012.11 These devices are often able to
measure and collect a range of data concerning a user's lifestyle and
health, including fitness, dietary habits and quality of sleep. This data is
already being used by industry, with some insurance companies using
information from tracking devices to adjust premiums.12 Given current
consumer attitudes, we expect greater adoption of wearable
technology over the coming years. Some analysts believe under-the-skin
chips will also become commonly used - not only to track vital health
data, but also to unlock doors, authenticate transactions and identity,
make payments and even sense magnetic fields.13


This technology is not limited to consumer devices. Multiple UK ‘smart
city’ projects are using IoT technology to improve the infrastructure and
running of urban areas. For example, the University of Bristol and Bristol
City Council have been collaborating on the ‘Bristol is Open’ project,
installing a city-wide network of fibre-optic cables, Wi-Fi, 3G and 4G
mobile broadband and a ‘mesh network’ of 2,000 lampposts
communicating via radio frequency. The goal is to provide a test-bed for
new IoT projects, connecting sensors in the field with data sets and
computing resources.14 Other councils, including Cardiff and Glasgow,
are experimenting with similar techniques.15


The growth of IoT might change the way we engage with existing
technology. In particular, the role of the smartphone is being increasingly
challenged. A 2018 Accenture report found that 43 per cent of people
believe phones will be replaced by wearables, and 40 per cent of
smartwatch users already interact less with smartphones today. Two
thirds of those who use a digital assistant - a demographic predicted to
double over 2018 - say they use their smartphone less now they own a
digital assistant.16 These newly adopted technologies are often
accompanied by novel data platforms - for example Alpine.ai, which
allows businesses to create content for Amazon and Google Assistants
using existing corporate data.17


Brands have begun to exploit IoT, for example through the use of
‘connected products’.18 A recent report by Mindshare trialled
connected food products and other household goods able to
communicate information and deliver content to consumers; for
example, triggering smartphone notifications to warn consumers that
food is about to expire. They found people were positive about this new
technology.19


As the IoT matures, new metrics and data points will become available.
This is especially true in the healthcare sector, with devices in
development which could allow the public to measure their blood
pressure, glucose levels, and even the state of their mental health.20
These new systems will result in new datasets that contain significant
detail about citizens’ personal lives, health and habits. The growth of this
data collection raises a number of serious concerns. Researchers have
long warned of the vulnerability of IoT devices to cyber-attacks and data
exfiltration.21 Even in the absence of unauthorised access, ownership of
data created by IoT devices is unclear. For each device, the owner, the
manufacturer and the software developers - as well as any companies
paid to analyse or aggregate the data it collects - may all have an
expectation to access some of the data produced.22


Commercial availability


The increased availability of data is widely viewed as a valuable 
commercial opportunity to drive efficiencies, improve products and 
create new markets and services.23 It is therefore becoming a business
imperative to collect, aggregate and analyse data.24 To help unlock the
value of data, the last few years have seen a growth in companies 
looking to make it easier for organisations - from consumer brands to 
political parties - to derive value from the data they hold, or have 
access to. 


A 2015 McKinsey report outlined the three main ways companies are
accessing data.25 First, data can be bought. Companies like Experian or
Acxiom offer large consumer databases, claiming to contain information
on customers’ ‘purchasing habits, lifestyles and attitudes.’ These can be
matched with internal company databases through identifiers such as
credit card details or telephone numbers. Panel data from companies
like Nielsen and Compete provide access to the activities of 2 million
consumers, providing granular views of behaviour, such as records of
web pages visited and consumer purchases made over a certain time
period. “Traveling cookie” data from Google’s ‘AdSense’ program builds
digital footprints for consumers, based on their logins at popular sites
(for example, on airline sites or Facebook). Once the customer logs in,
the cookie follows that customer across other websites. Aggregator
companies, like Datalogix, combine this data across hundreds of logins
and match it back to a database of more than 100 million households.27
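

The matching step described above, where a purchased database is joined to internal records through a shared identifier, can be sketched as follows. This is a hedged illustration rather than any vendor's actual process: the records, field names and helper functions are invented, and a telephone number is assumed to be the shared identifier.

```python
def normalise_phone(number):
    """Keep digits only so differently formatted numbers compare equal."""
    return "".join(ch for ch in number if ch.isdigit())

def match_records(internal, purchased):
    """Join two lists of records on a normalised telephone number.

    Returns one merged record per internal record that has a match;
    fields from the internal record win where both define a field.
    """
    by_phone = {normalise_phone(r["phone"]): r for r in purchased}
    merged = []
    for record in internal:
        extra = by_phone.get(normalise_phone(record["phone"]))
        if extra is not None:
            merged.append({**extra, **record})
    return merged

# Hypothetical records, invented purely for illustration.
internal = [{"customer_id": 1, "phone": "020 7946 0000"}]
purchased = [
    {"phone": "02079460000", "lifestyle": "keen gardener"},
    {"phone": "02079460999", "lifestyle": "frequent flyer"},
]

print(match_records(internal, purchased))
```

The point of the sketch is how little is needed: one common field is enough to attach lifestyle attributes from an external source to a company's own customer file.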


Second, companies can request data from customers. McKinsey advises
retailers to ‘encourage customers to self-identify by logging in to the
website, using a loyalty card in store, or identifying themselves when
calling customer care.’28 Finally, companies can share data through
partnerships. For example, Visa have partnered with retailers to introduce
highly targeted location-based offers to consumers as they make
purchases - scan your Visa at a Gap to make a purchase, and get offers
on your smartphone for retailers within walking distance.29


Combining data sets


The value of big data often lies in bringing together large and disparate
data sources. IoT data, social media data, geolocation data and
browsing history all provide the raw materials for this combination. This 
aggregative approach has been embraced by governments as well as 
corporations. While urban big data is still in its infancy, a partnership 
between the MIT Media Lab, Andorra’s government and international 
companies has been set up, aiming to understand and improve the 
dynamics of tourism, commerce, human mobility, transportation systems, 
and environmental impact.30 One project - the Tourist Recommendation
System - integrates data from cell phones, social media, bank 
transactions, energy consumption and transportation to study the flow 
and behaviour of people. This analysis is then used to predict tourist 
movement around Andorra, and to develop a targeted system guiding 
tourists to attractions, with the aim of increasing revenue. 


The combination of data sources at scale raises serious privacy concerns. 
A trade-off exists between obtaining the functional benefits of 
combining data and the danger of revealing potentially sensitive 
information. This is particularly true in the case of consumer devices. 
Amazon's ‘home assistant’ device Echo, for example, is sold with 
camera and voice set to be ‘always on’ by default, listening 
permanently for a ‘wake word’ which, if detected, will prompt it to 
capture data to send to Amazon. Smart TVs, including those sold by Sky
and Fire TV, now include similar voice activation mechanisms.31
Combining this data with information purchased or gathered from other
sources makes it difficult for consumers to understand how their data is
being collected and used, and therefore for them to provide effective
consent or make informed decisions.32


Targeting 
Key trends 


• Recent years have seen an explosion in data which can
be used to target advertising, including location, IP,
browsing data, and information collected from devices
and wearables.

• Advanced techniques in data analysis have enabled
marketing to move from area or platform-based
approaches to campaigns which target individual users
across multiple devices and locations.

• Novel analysis techniques are able to infer detailed
demographic characteristics from seemingly impersonal
data. These characteristics can then be used to profile
individuals in ways which are difficult to measure, and
difficult for consumers to opt out of.

• The next few years are likely to see a move to increasingly
automated marketing, where valuable individuals or
groups are tracked, measured and targeted by
machines, potentially using machine-generated content.


Overview 


Targeted advertising relies on collecting and analysing big data. Two 
types of actors in particular play an important role in the data 
ecosystem: data brokers, who collect and aggregate data, and data 
exchanges, which allow advertisers to bid for often time-sensitive data 
about consumers. 


Data exchanges often trade in staggering quantities of information.
BlueKai Exchange, which is run by Oracle and calls itself the world’s
largest data marketplace, offers ‘data on more than 300 million users
offering more than 30,000 data attributes.’ The exchange processes
more than 750 million data events and transacts over 75 million auctions
a day.33 This is often referred to as programmatic advertising - a new,
automated method for buying and placing advertisements on digital
media, using algorithmic processes to find and target a customer
wherever they go.34 According to a report from Forrester, programmatic
marketing will account for the majority of all digital advertising spending
within the next few years. The process involves real-time ‘auctions’ that
occur in milliseconds, allowing bidders to ‘show an ad to a specific
customer, in a specific context.’35
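

To make the idea of a real-time auction concrete, the sketch below resolves a single ad impression. The sources cited here do not specify a pricing rule, so a sealed-bid, second-price rule is assumed purely for illustration; the function name and the bids are hypothetical.

```python
def run_auction(bids):
    """Pick a winner for one ad impression.

    `bids` maps bidder name -> bid price. ASSUMPTION: a sealed-bid,
    second-price rule, where the highest bidder wins and pays the
    runner-up's bid (or their own bid if they are the only bidder).
    Returns (winner, price), or None if there are no bids.
    """
    if not bids:
        return None
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner, top_bid = ranked[0]
    price = ranked[1][1] if len(ranked) > 1 else top_bid
    return winner, price

# Hypothetical bids for a single impression.
bids = {"brand_a": 2.10, "brand_b": 1.75, "brand_c": 0.90}

print(run_auction(bids))
```

In a live exchange each bidder would set its price using whatever it knows about the user behind the impression, which is where the data attributes described above acquire their value.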


Despite the existence of large trading platforms, personal data useful to 
advertisers is often collected and stored by companies, rather than 
openly traded. These ‘data silos’ are typically controlled by large 
companies such as Facebook, Microsoft and IBM, who have tended to 
capitalise on data without selling the data itself, but instead by allowing 
indirect and temporary access to it and inferences made upon it. 


Some acquisitions in recent years are thought to have been driven by
the value of the data held by a company (for example, Microsoft’s
purchase of LinkedIn in late 2016).36 As transferring personal data
between companies and across borders becomes more strongly
regulated, we expect this trend to continue over the coming years.


Location targeting 


The vast majority of the UK population is now using devices able to 
record their location. On top of wearable devices with ‘onboard’ 
geolocation technology, a 2017 Deloitte survey found that 85 per cent of 
the UK population own or have access to a smartphone, and the 
company predicted this year that this number will surpass 92 per cent in
the next five years.37


This trend has been accompanied by upgrades to global geolocation
infrastructure, notably the European Union’s ‘Galileo’ system,
which is already accurate to around 1 metre, and will increase the
accuracy and reliability of location services within Europe as satellites are
added to the system over the coming years.38 Satellite-enabled
technology is a growing market: the organisation ‘IoT UK’ predict that the
global satellite Machine-to-Machine (M2M) and IoT market will reach 5.8
million in-service satellite M2M/IoT terminals by 2023.39


A large number of mobile applications are making use of location
technology, asking for (and often being granted) permission to precisely
geolocate their users. In 2015, the Pew Research Center found that
281,000 Android apps - 26 per cent of all applications on the app store at
the time - requested permission to access a user’s precise location. In
2016, a technology company that provides location services for
applications found that of 1 million phones using their services, 90 per
cent had location services enabled at the device level, and 52 per cent
had allowed the applications to access their location data.


The ability to access a user’s precise location is seen as a great boon by 
marketers seeking ‘increased relevance’ to the browsing public, 
especially since similar techniques can be used to anticipate, for 
example, where a motorist is likely to be going.41 Geo-fencing, a form of
mobile advertising which targets customers present in a given space, is
increasingly seen by marketers as a way to bridge the digital and offline
world.42 One example of this is the use of Bluetooth-enabled ‘beacons’,
which make short-range connections to devices and send out
corresponding messages; for example, pinging a special offer to a
customer who has been loitering near a product for some time.43
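

At its simplest, a geo-fence is a test of whether a reported position falls within a given distance of a point of interest. The sketch below uses the haversine formula for that test; the coordinates, radius and function names are hypothetical, and real systems may instead rely on beacon proximity or polygon boundaries.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres

def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points (haversine)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))

def inside_fence(user, centre, radius_m):
    """True if the user's (lat, lon) lies within radius_m of the centre."""
    return distance_m(*user, *centre) <= radius_m

# Hypothetical coordinates for a shop and two device positions.
shop = (51.5007, -0.1246)
nearby = (51.5010, -0.1240)
far_away = (51.5200, -0.1000)

print(inside_fence(nearby, shop, 100))
print(inside_fence(far_away, shop, 100))
```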


Detailed location data can also increasingly reveal how users are 
moving, which can be used to infer other demographic characteristics. 
For example, General Motors was granted a patent in March 2018 which 
describes the use of ‘vehicle trace data’, including ‘operator detection 
information, driving route information, operator driving behaviour 
information, media accessory usage information, and multimedia 
content usage information’ to infer characteristics including the age, 
gender and income of the driver. These characteristics would then be 
used to target advertising at the vehicle, e.g. through messages on an
in-car console.44


The ability for companies to precisely locate a user, sometimes over long 
periods of time, carries potential risks to that user’s privacy. This data is 
also difficult to anonymise and protect, and a number of cases have 
been identified where applications involving location functionality could 
be manipulated to disclose the position of other users.45 Notably, some 
consumers are also taking steps to protect their privacy online, driven by 
a wish to reduce their exposure to location-based advertising.46


One widely used approach to geolocation is through IP address, a 
ubiquitous identifier assigned to any device connecting to the Internet. IP 
addresses alone, however, are not usually sufficient to identify a given 
user, due to a number of factors. For instance, home and commercial 
networks, including public Wi-Fi hotspots, often use a single ‘public’ IP 
address for multiple devices, meaning the members of a household, or 
patrons of a pub, will all connect to a service with the same address. 
These public addresses typically belong to, and can be traced to, 
Internet Service Providers (ISPs) or companies.47 IP addresses are also
changeable - most consumer electronics are allocated ‘dynamic’
addresses using DHCP, a protocol through which an IP address will be
assigned to a device for a fixed length of time, then reallocated as necessary.
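

The shared-address problem can be illustrated by grouping devices under the public address a remote service would actually see. This is a minimal sketch with invented device names; the addresses come from ranges reserved for documentation, and `ipaddress` is the Python standard library module.

```python
import ipaddress
from collections import defaultdict

def devices_per_public_ip(connections):
    """Group device labels by the public address a service would see.

    `connections` is a list of (device, public_ip) pairs.
    """
    seen = defaultdict(set)
    for device, public_ip in connections:
        ipaddress.ip_address(public_ip)  # raises ValueError if malformed
        seen[public_ip].add(device)
    return dict(seen)

# Hypothetical household and office, using documentation-only addresses.
connections = [
    ("laptop", "203.0.113.7"),
    ("phone", "203.0.113.7"),
    ("tablet", "203.0.113.7"),
    ("office-pc", "198.51.100.20"),
]

for ip, devices in devices_per_public_ip(connections).items():
    print(ip, sorted(devices))
```

Three household devices collapse into one address here, which is why an IP address alone rarely singles out an individual.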


Despite these difficulties, IP addresses remain a useful part of a
marketer's targeting arsenal. IP addresses can
often be used to infer the location and travel habits of a user when
accurate geolocation data is not available. In cases where an IP
address alone is not sufficient to pinpoint a user, tracking can often be
achieved by combining the address with other identifying data; for
example, the presence of tracking cookies, or the precise website visited
from an address.48


The IoT location services market is rapidly growing,49 and marketers are
beginning to see the potential in combining IoT devices and location
tracking, for example to offer timely location-based offers. One possible
future for IoT and location-based marketing is outlined in a recent blog
by IBM, which suggests a combination of data on diet, sleep and
exercise could be used by a supermarket to send healthy food offers at
the nearest store.50 The Royal Academy of Engineering warned recently
that IoT devices posed a particular new risk to privacy in the way that a
person's location could be more easily identified through analysis of
devices.51


Increasing granularity 


The broad trend is towards ever more granular audience segmentation. 
Facebook and Google provide a number of tools that allow companies 
to segment their audiences and target their adverts. Facebook tools 
include the ‘data file custom audience’ (enables advertisers to reach 
existing customers on Facebook using information those advertisers 
already hold e.g. the customer has already given their email), ‘website 
custom audience’ (enables advertisers to target people on Facebook 
who have visited their website, through use of the Facebook Pixel) and 
‘lookalike audiences’ (this enables advertisers to reach new people on 
Facebook who are likely to be interested in their business because they 
are similar to existing audiences).52


This final tool is of particular interest. According to Salesforce, lookalike 
audiences are created ‘based on extensive social graph data including 
demographics, interests, social connections, and newsfeed activity.’53
The ability to tailor these ‘lookalike’ audiences is becoming increasingly 
granular. In 2017, Facebook added ‘value-based lookalike audiences’ 
for commercial businesses which, according to the site ‘creates an 
additional weighted signal for people most likely to make a purchase 
after seeing your ad.’54 Google also offer similar tools, called ‘customer 
match targeting’, allowing marketers to find their customers using an 
email, phone number or address.55


Several companies are developing similar approaches. Companies such
as Outbrain, a content discovery platform, describe a three-step process
where they first ‘identify a Custom Audience as a “seed”’ followed by a
‘Personal Interest Profile [which] reveals your seed audience's interests
and reading and watching habits’, finally creating a lookalike audience
which ‘adapts in real-time based on live campaign learnings and
performance.’56
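

The platforms do not publish how lookalike audiences are computed, but the general idea of finding people who resemble a seed group can be sketched with a similarity measure. The example below is an assumption-laden illustration, not Facebook's, Google's or Outbrain's method: it represents each person as a hypothetical vector of interest scores and keeps candidates whose cosine similarity to the seed average exceeds an arbitrary threshold.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def lookalikes(seed, candidates, threshold=0.9):
    """Return ids of candidates whose vector is close to the seed centroid."""
    size = len(seed)
    centroid = [sum(column) / size for column in zip(*seed)]
    return [cid for cid, vec in candidates.items()
            if cosine(vec, centroid) >= threshold]

# Hypothetical interest scores: (sport, cooking, travel).
seed = [(0.9, 0.1, 0.6), (0.8, 0.2, 0.7)]
candidates = {
    "u1": (0.85, 0.15, 0.65),  # similar profile to the seed audience
    "u2": (0.10, 0.90, 0.20),  # very different profile
}

print(lookalikes(seed, candidates))
```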


More broadly, the trend is toward finer and smaller audience targeting.
SimMachines, a machine learning software company, has developed
what it calls ‘dynamic predictive segmentation’. SimMachines’ chief
marketing officer recently set out how ‘Dynamic Predictive
Segmentation provides clients with highly precise, contextually relevant
and inherently actionable insights as to the motivating drivers behind a
customer's predicted behaviour at machine learning speed and
scale.’ Marketo is using AI both to segment audiences and to
deliver tailored content.58 Other companies such as Intent Targeting are
looking to create ‘incredibly sophisticated and granular audiences.’
Intent Targeting claims to provide ‘more context and more information
than any segmentation or audience building engine before.’59 These
tools provide marketers with highly targeted and granular data, allowing
advertisers to narrowly define their target segment.


The end goal of this trend towards segmentation might be to target 
consumers as individuals. According to the chief marketing officer of 
Unilever, the future will be ‘segments of one.’ Similarly a 2016 
CapGemini report asks ‘if it still makes sense to use micro-segmentation 
when today it is possible to target customers directly and individually with 
the help of data science and new technologies.’61 A 2018 paper titled
‘Facebook's Advertising Platform: New Attack Vectors and the Need for 
Interventions’ raises concern over the potential this granular targeting 
has to enable privacy violations, showing that Facebook’s Audience 
Insights tool, which provides advertisers with details of who their ad 
reached, ‘can be run on audiences as small as one person - and when 
run, provide insights that include 2,000+ categories of information.’62


One key aspect of this increasing personalisation is cross-device
targeting, which tracks people rather than devices. This approach
recognises that the same person uses multiple devices and seeks to
establish ‘a person-centric view of a user across devices.’ Writing in the
Journal of Advertising Research, Evan Neufeld says it is becoming
‘mandatory practice’.64


According to the US Federal Trade Commission, companies use a
combination of methods to identify and reach consumers.65 There are
broadly two types of cross-device targeting: deterministic and 
probabilistic. 


• Deterministic targeting identifies people by tying devices to a
common persistent identifier - such as a name or email address.
These unique identifiers are collected when consumers provide
details at websites, for example when creating a log in.


• Probabilistic targeting works by first identifying various devices (for
example, through a cookie, hardware identifier, or device
fingerprint), and then comparing collected information about
those devices for common identifiers to infer a likelihood of
whether those devices are used by the same person. Common
identifiers might include IP addresses, location or browsing
patterns. Estimates on the accuracy of probabilistic device
correlations range as high as 97 per cent.66
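

The difference between the two approaches can be shown in a few lines. This is a simplified sketch under our own assumptions: deterministic matching is modelled as equality of a hashed email address, and probabilistic matching as the Jaccard overlap of IP addresses seen on each device. Commercial systems combine many more signals, and all names and data below are hypothetical.

```python
import hashlib

def hash_email(email):
    """Hash a normalised email so the raw address need not be stored."""
    return hashlib.sha256(email.strip().lower().encode()).hexdigest()

def deterministic_match(dev_a, dev_b):
    """True only if both devices carry the same persistent identifier."""
    return (dev_a["email_hash"] is not None
            and dev_a["email_hash"] == dev_b["email_hash"])

def probabilistic_score(dev_a, dev_b):
    """Jaccard overlap of observed IP addresses, between 0 and 1."""
    a, b = set(dev_a["ips"]), set(dev_b["ips"])
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical devices; the tablet has never been logged in to.
phone = {"email_hash": hash_email("Sam@example.com"),
         "ips": ["203.0.113.7", "198.51.100.20"]}
laptop = {"email_hash": hash_email("sam@example.com "),
          "ips": ["203.0.113.7"]}
tablet = {"email_hash": None,
          "ips": ["203.0.113.7", "198.51.100.20", "192.0.2.5"]}

print(deterministic_match(phone, laptop))   # same login on both devices
print(deterministic_match(phone, tablet))   # no shared login
print(probabilistic_score(phone, tablet))   # overlap in observed networks
```

The tablet is never tied to a login, yet its network history still links it to the phone with some probability, which is what makes this form of tracking hard to opt out of.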


Through the use of probabilistic targeting, companies are increasingly 
able to correctly link devices even in the absence of a persistent 
identifier such as email addresses or user name. This - combined with the 
fact that cookie blocking software or third-party connections may not 
prevent companies collecting other identifiers such as IP address - can 
make it difficult for consumers to opt out of cross-device tracking. The 
consultancy McKinsey describe how AI-based cross-device targeting
increasingly allows companies to optimise this probabilistic tracking over
time through a ‘self-learning’ process. The company considers this to be
a potentially important development, to help ‘provide real-time offers
targeted to individual customers’.67


Facebook and Google both offer cross-device targeting on a 
deterministic basis. Some companies offer a mix of deterministic and
probabilistic methods.68 One example is Lotame, a data management
platform. Lotame sets out how it thinks cross-device targeting adds value
to marketing campaigns: ‘Let’s say you have a user bored at work 
looking up your product online. You can follow them with your marketing 
throughout the rest of their day. You can target them during their 
evening commute on their phone with a relevant ad. Then, later, when 


The Future of Political Campaigning 


sitting on the couch you can serve them a relevant ad via their 
television. Finally, before they doze off, you can serve them another ad 
on their tablet.’69


The variety and ubiquity of IoT devices will also affect a marketer's ability
to target audiences through multiple aspects of their behaviour, with
some commentators heralding the IoT as ‘the future of marketing.’70
Advertisers like BannerFlow are positive about the opportunity provided:
‘What’s certain is that the IoT will provide new data, touchpoints and
opportunities, which both ad tech, and users of ad tech, must use to
make experiences more personal, relevant and targeted.’71 One
academic paper seeking to define and explore the possible applications
of advertising using IoT (we believe the first paper of its kind) sets out how
IoT will ‘open up a novel, large-scale, pervasive digital advertising
landscape,’ concluding that there should be an IoT advertising platform
just like the existing platforms for internet advertising.72


Companies investing in developing new cross-device targeting methods 
often seek to protect their advances through patents, which can offer 
some useful insight into how the technology might develop. In the 
context of the move towards targeted advertising via IoT, it is notable
that companies are seeking to future-proof their cross-device targeting 
methods. A patent submitted by Gula Consulting LLC (on cross-network 
user behavioural data) states that ‘although embodiments of the present 
invention are described herein in the context of the Internet, the present 
invention is not so limited and may be used in other data processing 
applications’. Intent IQ's patent on television advertising specifies that 
television advertising can be targeted based on the user profile collated 
from multiple online devices, only one of which needs to be directly 
associated with the set-top box (the device that connects a television 
and a signal source).74 Yahoo! Inc's patent on cross-device targeting sets
out how, once they have mapped a user to their devices, they can pull
up a profile of that user and use the user's likes and dislikes to target
specific ads to that user on any of the user's devices, whether or not the
user is logged in.75 This patent also specifies that patterns of how the
person uses the internet while online, like the font they use for typing and
the arrangement of their applications, can be used to identify users.76


It is likely that there will be an explosion of competition for this market,
with Amazon and Google facing competition from Sonos, Samsung,
Apple, Microsoft, and possibly Facebook.77 Amazon, for example, holds a
patent that describes an algorithm that can listen to entire
conversations, using ‘trigger words’ such as ‘like’ and ‘love’ to build a
profile of customers.78 The document states the system could then offer
‘targeted advertising and product recommendations’.


There is a potential tension between personal privacy and choice and 
the goals of cross-device targeting. (Intent IQ's patent on television
advertising highlights this explicitly).79 Academics have recently noted
that cross-device targeting can reveal a more complete picture of a
person and, thus, ‘become more privacy-invasive than the siloed
tracking via HTTP cookies or other traditional and more limited tracking
mechanisms.’80 To add to these privacy concerns, the practice of
cross-device tracking is difficult to monitor, since companies can make
determinations of device correlation on their own servers in a manner
opaque to end users and regulators.81


The metrics used to identify individuals often arise from surprising sources. 
Using indicators such as the speed of typing, duration of keypresses and 
typing errors, Performetric (a technology company aiming to improve 
workplace wellbeing) has developed software capable of detecting 
patterns associated with mental fatigue. Both keyboard and mouse 
interaction metrics combine to create a unique profile that provides the 
user with recommendations that aim to improve mental health.82
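

The kinds of typing indicators mentioned above can be derived from nothing more than key press and release timestamps. The sketch below is illustrative only and is not Performetric's algorithm: the event format, feature names and sample data are assumptions of ours.

```python
def keystroke_features(events):
    """Summarise typing behaviour from timestamped key events.

    `events` is a list of (key, press_time, release_time) tuples in
    typing order, with times in seconds. Returns the mean time each
    key was held down and the overall typing rate.
    """
    if not events:
        raise ValueError("at least one key event is required")
    holds = [release - press for _, press, release in events]
    elapsed = events[-1][2] - events[0][1]
    return {
        "mean_hold_s": sum(holds) / len(holds),
        "keys_per_s": len(events) / elapsed if elapsed > 0 else 0.0,
    }

# Hypothetical sample: four key presses over two seconds.
events = [
    ("h", 0.00, 0.10),
    ("i", 0.50, 0.62),
    ("!", 1.00, 1.08),
    ("\n", 1.90, 2.00),
]

print(keystroke_features(events))
```

Tracked over time, shifts in features like these are the sort of signal that could be read as fatigue, and they could equally help distinguish one typist from another.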


Ability to potentially improve the effectiveness of communication 


The production and evaluation of advertising content - particularly in the 
fields of focus groups and split testing - has been supercharged by AI
tools such as Facebook's Dynamic Creative and Albert.ai.83 These allow
content producers to create and test vast numbers of adverts to find the 
optimal combination of design features for a given ad campaign, and 
for a given target audience. By using engagement metrics such as view 
times, click-through rates, and sales conversions, these tools allow 
producers to pinpoint the design feature that resonates best with certain 
groups, be it the colour of a ‘click here’ button or a particular image. 


In one example of this, Facebook's Dynamic Creative allows advertisers 
to provide a maximum of five variations each for titles, images, text, 
descriptions, and ‘calls to action’. An algorithm is then used to 
automatically generate a series of ad variants based on these 
components.84 By providing the basic structure and building blocks of an 
advert, this system automates the process of evaluating content, 
delivering the best performing combinations broken down in terms of 
audience segments. This allows advertisers to carry out many more 
experiments ‘in the field’ than would be possible if adverts were 
generated and tested manually. 
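

The combinatorial step described above can be illustrated with a short 
Python sketch. It is not Facebook's implementation, which is not public: 
the function names, the component labels and the use of click-through 
rate as the only metric are assumptions made for illustration. 

```python
from itertools import product

# The text describes a cap of five options per component.
MAX_OPTIONS = 5


def generate_variants(components):
    """Return every advert formed by picking one option per component.

    `components` maps a component name (e.g. "title") to its list of options.
    """
    for name, options in components.items():
        if not 1 <= len(options) <= MAX_OPTIONS:
            raise ValueError(f"{name}: expected 1 to {MAX_OPTIONS} options")
    names = list(components)
    return [dict(zip(names, combo)) for combo in product(*components.values())]


def best_variant_per_segment(results):
    """Pick the variant with the highest click-through rate in each segment.

    `results` is a list of (segment, variant_id, impressions, clicks) tuples.
    """
    best = {}
    for segment, variant_id, impressions, clicks in results:
        if impressions == 0:
            continue  # no exposure, so no rate can be computed
        ctr = clicks / impressions
        if segment not in best or ctr > best[segment][1]:
            best[segment] = (variant_id, ctr)
    return {segment: variant for segment, (variant, _) in best.items()}
```

With five options for each of the five components, a single campaign 
brief expands to 5 x 5 x 5 x 5 x 5 = 3,125 candidate adverts, which 
gives a sense of why this kind of testing is impractical by hand. 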


This approach does not involve the production of entirely novel content. 
All of the constituent tools, however, exist for personalised messaging to 
become (near) fully autonomous, though these technologies are still at 
an early stage of development. With the integration of broad data 
sources and AI-generated text, video and audio (discussed below), some 
advertisers are taking the first steps towards directly personalised, 
automated advertisements.85 IBM's Watson Advertising has piloted several 
advertising campaigns with Toyota, GSK and Campbell's Soup, which 
bring together personal data, user engagement, and dynamic content 
generation.86 Campbell's, for example, has used this approach to 
advertise soup when weather data in a local area indicates it is cold or 
raining, and to suggest a set of dishes and recipes given an ingredient 
provided by a user. This system can learn dietary preferences through 
repeated interactions - for example, users likely to be vegetarian can be 
offered suitable recipes.87 


Other large companies have shown signs of researching automatic 
generation of advertisements. In 2017, the senior digital director of 
Coca-Cola signalled his intention to use AI to help generate music and 
scripts. A recent blog for IBM suggested that Watson Cognitive Computing 
would be able to serve an advert to a particular individual it knows has 
recently bought a fridge. Given this knowledge, it would generate an 
advert involving the fridge and given the individual's particular brand 
loyalty ‘dynamically swap in a product they love — such as Coke for 
Pepsi — into the video the consumer is watching to create a powerful, 
personalized brand experience.’89 


Some targeted advertising aims to use people's networks to pique 
interest. The idea is that if people you know - or who share your 
interests - are talking about something, you may be more likely to buy it. 
One approach, employed by Facebook, is to send the same messaging to 
every member of a family.90 One method for leveraging this network 
effect is through the identification of influencers, a technique already 
commonplace in marketing. Influencers range from celebrities to 
bloggers to families, and are recognised by brands as valuable ways to 
improve brand messaging and recognition, and ultimately improve sales. 
According to David Yavanno, a digital marketing specialist, employing 
AI to identify key influencers within a given group of people is likely to 
become increasingly important over the coming years.91 Similarly, Lyle 
Stevens, CEO of influencer marketing firm Mavrck, predicts that software 
will soon automate the task of identifying influencers and encouraging 
them to post content.92 


Artificial intelligence 


Key trends 


• Existing technology has already allowed AI systems to outperform 
human experts in a number of tasks, and AI technology is likely to 
surpass human competence across a growing range of domains 
in the near future. 

• Recent developments in AI have allowed systems to produce 
original and realistic visual and audio content, a capability which 
blurs the line between human and machine-produced content, 
and is set to improve. 

• Evolving techniques in deep learning are enabling systems to 
decide, for themselves, how to make detailed inferences from 
highly abstract datasets. This is likely to create revealing 
information about users, even though the datasets themselves 
may contain little or no personal data. 


Artificial intelligence, machine learning and deep learning 


Artificial Intelligence (AI) can be loosely defined as a branch of 
computer science which studies and develops systems able to perform 
tasks which would typically require human intelligence. The technology 
has recently enabled important breakthroughs in diverse fields, including 
language translation, navigating roads and playing games.93 A recent 
survey asked 352 experts in the field whether and when they expected AI 
systems to outperform humans over a variety of tasks, including 
performing surgery and writing bestselling books. Researchers responded 
that there was ‘a 50 per cent chance of AI outperforming humans in all 
tasks in 45 years, and of automating all human jobs in 120 years.’94 


Most current AI systems are ‘narrow’ applications - specifically designed 
to tackle a well-specified problem in a single domain.95 (The possibility 
and challenges of superintelligent AI systems, which would be 
competent across all relevant domains, are beyond the scope of this 
paper. There is currently no consensus on the likelihood or timeline of the 
development of superintelligent AI.)96 


One branch of AI which has seen significant recent developments is 
machine learning, an approach which allows systems, given enough 
data and time to train, to learn how best to solve a problem or imitate 
human decision making. The approach reduces dependence on subject 
matter expertise - a machine learning system could be trained to work 
out how to recognise a smartphone user by the sound of their voice, for 
example, without requiring an expert in speech to sit down and develop 
a strict set of rules to follow. 


Recent, well-documented advances in machine learning have enabled 
machines to compete with or outperform humans at complex tasks. For 
example, Oxford University has developed a system able to lipread 
better than human professionals, and researchers at Stanford have 
trained a system to be able to identify skin lesions at a level on a par with 
human dermatologists.97 Many people in Silicon Valley believe that 
machine learning is the next big thing. Andrew Ng, former chief scientist 
at Baidu, reckons that there isn’t a single industry that won't shortly be 
‘transformed’. Indeed, over the last year alone, inroads have been 
made into tasks including driving, bricklaying, fruit-picking, burger- 
flipping, banking, trading and automated stock-taking. Legal software 
firms are developing statistical prediction algorithms that can analyse 
past cases and recommend trial strategies. In recruitment, tools to 
analyse CVs are now routinely used by companies to help them filter out 
obviously unsuitable candidates. 


Some of these breakthroughs have been achieved through the 
application of deep learning, a branch of machine learning which 
enables machines to learn not only how best to classify a given input, but 
also to work out what the important features of that input might be.98 A 
notable aspect of deep learning is its ability to draw from large, diverse 
datasets, and make surprising inferences from seemingly innocuous 
data. In a recent example, researchers at MIT have published a paper 
which uses deep learning to accurately infer a phone owner's age and 
gender using only metadata concerning their phone calls, e.g. the time 
a call was made and its duration.99 As datasets owned by companies 
increase in size and complexity, and further public datasets become 
available, deep learning will allow companies to draw increasingly 
detailed inferences about their customers, and connect data in ways 
that may not occur to human analysts. 


Machine learning is a vibrant academic field, and recent years have 
seen the development of new techniques and methods, each with their 
own strengths and applications. These include Convolutional Neural 
Networks (CNNs), which have enabled advances in deep learning, and 
Generative Adversarial Networks (GANs), notably employed by 
DeepMind’s image generation software, which can be used to 
synthesise, or ‘hallucinate’, original and realistic images, including of 
human faces.100 


Recent advances in this field are expected to accelerate development. 
These include improvements in unsupervised learning techniques, which 
remove the need for systems to be trained on large sets of labelled data, 
and the increasing availability of specialised hardware, including ‘tensor 
units’, which are custom built for use in machine learning.101 As the 
battery life of smartphones increases over the next three to five years, 
machine learning capability is expected to move from data centres to 
devices, which is especially useful for applications, such as in-camera 
image recognition, that benefit from low latency. Huawei announced in 
2017 that they were developing an AI-tuned mobile processor,102 and Google 
has recently released a mobile version of their TensorFlow platform.103 


Possible future applications for AI are diverse and contested. A good 
overview is given by Nick Bostrom, the director of Oxford University’s 
Future of Humanity Institute. In addition to likely positive applications, 
such as a reduction in road accidents as machines begin to help us 
drive, Bostrom expresses concern over possible uses of autonomous 
weapons and the prospect of new forms of government and private 
surveillance. Increased reliance on complex autonomous systems for 
many essential economic and infrastructural functions may create novel 
kinds of systemic accident risk or present vulnerabilities that could be 
exploited by hackers or cyber-warriors.104 


Sentiment analysis, image recognition and links to mood 


Sentiment analysis seeks to understand a person's position, attitude or 
opinion towards a specific topic, person, or entity, and has many 
applications, from companies seeking to understand how consumers 
perceive their brand to political polling.105 These approaches typically 
employ Natural Language Processing (NLP), a branch of AI which 
analyses unstructured text. 


As in other areas of AI, the key challenge for NLP is to convert messy 
human data - the text of a social media post, for example, or a YouTube 
video - into a discrete measure of sentiment. Analysing the sentiment of 
a sentence as a human would, however, is a deceptively difficult 
problem.106 Wrapped up in an assessment of whether a post is positive or 
negative are challenges not only of detecting emotive language, but 
also disambiguating the senses of words, identifying sarcasm and 
working out where each sentence actually ends. 
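

A toy example makes the difficulty concrete. The Python sketch below is 
a deliberately naive word-counting scorer, not a description of any 
commercial tool; the word lists and example sentences are invented for 
illustration. It handles plain emotive language but is misled by 
negation and sarcasm. 

```python
import re

# Tiny illustrative lexicon (hypothetical, not taken from any real tool).
POSITIVE = {"love", "great", "good", "brilliant"}
NEGATIVE = {"hate", "awful", "bad", "terrible"}


def naive_sentiment(text):
    """Score a text as positive-word count minus negative-word count."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives


examples = [
    "I love this policy",                   # plain emotive language
    "This is not good",                     # negation: the meaning is negative
    "Oh great, another delay. Brilliant.",  # sarcasm: the meaning is negative
]
scores = [naive_sentiment(text) for text in examples]
```

All three sentences receive a positive score, although a human reader 
would judge only the first to be positive. 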


One result of these difficulties is that the ability of machines to 
accurately assess sentiment in general remains elusive. The authors of a 
study published in 2018, which attempted to apply six separate 
state-of-the-art sentiment analysis tools to a novel dataset (responses to 
questions on the software discussion site Stack Overflow), were forced to 
publish negative results after no tool was able to reach a satisfactory 
level of accuracy. ‘Our results’, write Lin et al., ‘should warn the research 
community about the strong limitations of current sentiment analysis 
tools.’107 


Recent advances in the science, however, might help overcome some 
of these limitations. One approach, termed ‘multimodal sentiment 
analysis’, brings in non-text modalities, including speech and vision. This 
method is arguably well-suited to modern social media use, where 
content increasingly incorporates video and images as well as text. 


Current research in multimodal analysis is focussed on three key areas: 
spoken reviews and vlogs, human-machine and human-human 
interaction, and visual sentiment. A key strength of this approach is the 
ability to study inter-modality dynamics: the interactions between 
language, sight and sound that change the perception of the expressed 
sentiment. ‘Emotional analysis’ is one variant of this approach, where 
emotions are extracted from images and video via facial-expression 
analysis, or from speech or text, with the aim of quantifying emotional 
reactions to what we see, hear, and read. Providers include Affectiva, 
Emotient, and Realeyes for video, Beyond Verbal for speech, and 
Kanjoya for text. Adopters in this rapidly expanding market include 
advertisers, media, marketers, and agencies.108 


Combining different data points to determine mood or sentiment holds 
considerable potential. As noted in one recent paper, ‘physical states 
such as running or walking can be inferred from accelerometer data; 
colocation with other devices can be detected using Bluetooth; 
geolocation can be established using Wi-Fi, Global Positioning System 
(GPS), or Global System for Mobile (GSM) triangulation; and social 
interactions can be measured by records of text messages and phone 
calls. These data can be recorded by dedicated apps, such as 
EmotionSense, which measures emotional states based on the speech 
patterns and matches it with physical activity, geolocation, and 
colocation with other users.109 


There are a number of commercial actors similar to EmotionSense 
employing sentiment analysis in their products. Affectiva, spun out of 
MIT's Media Lab, develops software which it claims can track sentiment 
and emotion, and is moving towards fully multimodal sentiment 
analysis, detecting emotion ‘the way humans do, from multiple 
channels.’110 Similarly, in the smartphone market, Beyond Verbal 
specialises in voice sentiment analysis to ensure that AI-driven Personal 
Assistants are ‘emotionally aware as well as more in tune to the 
customers’ needs and expectations.’111 


One of the next-wave developments in retailing might be the use of 
facial recognition technology - and possibly biometric data - to analyse 
patterns of behaviour and respond accordingly. A customer standing in 
a shop and looking confused, for example, might be offered guidance 
through a text message.112 Both Beyond Verbal and EmotionSense 
suggest their technology can be used to tailor content 
recommendations - for music or restaurants, for example - based on 
consumer mood and reactions to stimuli. 


Psychographics 


‘Psychographic’ techniques aim to determine the specific personality 
types, attitudes, values, and interests of users, and to produce content 
that is informed by that user’s specific personality profile. For several 
years companies have seen this approach as a potentially important 
source of insight to supplement demographic data. In their broadest 
sense, psychographic techniques have been in use for many years; 
personality-based profiling using surveys, market research and focus 
groups has a long history in advertising.113 


The key new development in this field (from the perspective of this 
paper) is the use of digital data to derive psychographic insight, often at 
very large scale. The big data explosion has vastly increased the amount 
of data available for psychographic profiling. This data-driven approach 
has become popular in advertising in recent years, stimulated in part by 
falling engagement rates with traditional targeting techniques, in part by 
recent findings that (unsurprisingly) people with different personalities 
tend to exhibit different purchasing behaviours.114 


One method employed in this form of psychographics is to ask people to 
undertake personality tests, and then cross-reference the results against 
online behaviour or online data (for example, Facebook likes). By 
drawing a correlation between the two, it is then potentially possible to 
determine personality profiles from other users’ online behaviour alone. 
Psychographic data can also be inferred from records of a subject’s 
behaviour online. Even an understanding of a customer’s search terms, 
or a list of their likes on Facebook, can be processed to offer some broad 
insight into whether they might be extrovert, enjoy travelling, or so on. 
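

The cross-referencing step can be sketched in a few lines of Python. 
This is a simplified illustration rather than the method used in the 
research cited here, which relies on far larger datasets and more 
sophisticated statistical models; the page names, the 0-to-1 
extraversion scores and both function names are hypothetical. 

```python
from collections import defaultdict
from statistics import mean


def fit_like_scores(survey):
    """Learn an average trait score for each liked page.

    `survey` is a list of (likes, trait_score) pairs from users who took
    a personality test, where `likes` is a set of page names.
    """
    scores_by_like = defaultdict(list)
    for likes, trait_score in survey:
        for like in likes:
            scores_by_like[like].append(trait_score)
    return {like: mean(scores) for like, scores in scores_by_like.items()}


def predict_trait(likes, like_scores):
    """Estimate the trait for a user who never took the test.

    Returns None when none of the user's likes appeared in the survey.
    """
    known = [like_scores[like] for like in likes if like in like_scores]
    return mean(known) if known else None


# Hypothetical survey data: (pages liked, extraversion score from 0 to 1).
survey = [
    ({"festivals", "skydiving"}, 0.9),
    ({"festivals", "chess"}, 0.6),
    ({"chess", "poetry"}, 0.2),
]
like_scores = fit_like_scores(survey)
estimate = predict_trait({"festivals", "poetry"}, like_scores)
```

Here a user who has never taken the test, but who likes ‘festivals’ and 
‘poetry’, is assigned an extraversion estimate of about 0.475, purely on 
the basis of how surveyed users with the same likes scored. 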


In 2013 Dr Michal Kosinski et al published a paper showing that a user’s 
Facebook likes can be used to quickly and accurately predict sexual 
orientation, ethnicity, religious and political views, personality traits, 
intelligence, happiness, use of addictive substances, parental 
separation, age and gender.115 


How well targeting based on inferred personality models actually works is 
hard to determine, though Dr Kosinski et al’s recent paper suggests that 
this approach can improve the performance of marketing. Through the 
use of psychologically tailored advertisements, their study reached over 
3.5 million individuals, and found that persuasive appeals that were 
matched to people's extraversion or openness-to-experience level 
resulted in up to 40 per cent more clicks and up to 50 per cent more 
purchases than their mismatching or un-personalized counterparts. 
Interestingly, this study may underestimate the effectiveness of this 
approach - Kosinski et al approximated people's psychological profiles 
using a single Facebook Like per person, instead of predicting individual 
profiles using people's full history of digital footprints, which could prove 
more effective.116 


The ongoing trend of integrating diverse datasets into psychographic 
techniques is likely to improve the effectiveness of these techniques. As 
Deloitte's 2017 Tech Trends report explored, insight-rich data can be 
gathered from a host of sources - transactional systems, industrial 
machinery, social media, IoT sensors - as well as from non-traditional 
sources such as images, audio, video, and even the deep 
web.117 The continued increase in available data - including social data 
- promises ever deeper insight into personality-based profiling, which is 
likely to drive a marked improvement in ad engagement. In a recent 
analysis of the effectiveness of these techniques, Kosinski et al argue that 
the availability of new datasets will help progress the field: 


‘As more behavioural data are collected in real time, it will 
become possible to put people’s stable psychological traits in a 
situational context. For example, people’s mood and emotions 
have been successfully assessed from spoken and written 
language, video, or wearable devices and smartphone sensor 
data. Given that people who are in a positive mood use more 
heuristic—rather than systematic—information processing and 
report more positive evaluations of people and products, mood 
could indicate a critical time period for psychological persuasion. 
Hence, extrapolating from what one does to who one is is likely just 
the first step in a continuous development of psychological mass 
persuasion’.118 


Other techniques are also being applied to online data in an attempt to 
investigate the inner lives of users. Social media data, for example, can 
be used to better understand a user's mental health. In one recent 
paper, Choudhury et al argued that Twitter has potential as a tool for 
measuring and predicting major depression in individuals, with their tool 
providing 70 per cent classification accuracy (i.e. around seven in ten of 
its classifications were correct - although this figure alone does not 
show how many false positives or false negatives were also found). 
Additional studies show that characteristic language use patterns 
associated with depression may allow for the detection of mental illness. 
While the findings from these studies are still preliminary, there is evidence 
to suggest that automated detection methods applied to large-scale 
monitoring of social media may assist with future screening 
procedures.119 This is likely to create significant ethical challenges. 
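

Why a single accuracy figure can hide the number of false positives is 
easiest to see with a worked example. The Python sketch below uses 
invented counts, chosen only so that accuracy comes out at 70 per cent; 
they are not taken from the study discussed above. 

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard metrics from the four confusion-matrix counts.

    tp / fp: flagged as depressive, correctly / incorrectly.
    fn / tn: not flagged, incorrectly / correctly.
    Assumes at least one flagged case and one truly depressive case.
    """
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,   # share of all decisions that were right
        "precision": tp / (tp + fp),     # share of flagged users truly depressive
        "recall": tp / (tp + fn),        # share of depressive users who were flagged
    }


# Hypothetical counts for 100 users, for illustration only.
metrics = classification_metrics(tp=30, fp=20, fn=10, tn=40)
```

With these counts the classifier is right 70 per cent of the time 
overall, yet only 60 per cent of the users it flags are in fact 
depressive, and a quarter of depressive users are missed. A different 
mix of errors could produce the same 70 per cent accuracy, which is why 
the measure on its own says little about the false positives that 
matter for screening. 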


There are very few papers which examine in detail the likely evolution of 
psychographics. However, one probable area of growth - based on 
current trends - is the further integration of new forms of data. This would 
entail building algorithms which use, for example, smart device use, Fitbit 
data, cookie (i.e. web-browsing) data and health app data, and which 
cross-reference these against people's known personality profiles. It is 
probable that new correlations will be found between online behaviour and 
personality types, from data sets one would not normally associate with 
personality profiles. These inferences may be so complex that it would be 
hard to determine what the key data signals were. 


Automated content generation 


One important development in AI in recent years is the ability to 
generate content automatically (one version of this is called ‘Natural 
Language Generation’, or NLG). While previous content generation 
relied on rules-based approaches, a number of companies are now 
building NLG AI.120 NLG output can include e-mail, text messages, 
summaries, and translations (providers include Arria, Narrative Science, 
Automated Insights, Data2Content, and Yseop).121 


Research in relation to NLG is generating promising results, particularly in 
terms of the authenticity of the text. A recent research paper, published 
by Tang et al, proposed a novel approach for context-aware NLG to 
produce ‘natural’ fake reviews.122 They found that fake reviews were 
judged to be real 50 per cent of the time by human judges, and 90 per 
cent of the time by a state-of-the-art fake review detection algorithm.123 
Commentators predict a future, in a few years’ time, where NLG and NLP 
will go hand in hand to generate narratives in real time based on 
unstructured data (text, pictures and videos).124 


Some companies see this as an opportunity to disrupt the digital content 
industry.125 For example, Narrative’s AI product claims to help companies 
generate ‘millions of narratives per day’ far more cheaply than human 
content producers.126 In one recent report for Oxford University, Nic 
Newman reports that three-quarters of Editors, CEOs, and Digital Leaders 
are planning to actively experiment with AI to support better content 
recommendations and to drive greater production efficiency. The Times 
and Sunday Times are building a recommendation service called James, 
which aims to personalise each edition in terms of format, time, and 
frequency.127 


One new approach called Generative Adversarial Networks (GANs) may 
speed up the rate of improvement. Using GANs, two AI systems can spar 
with each other to create ultra-realistic original images or sounds. This 
gives machines something akin to a sense of imagination, which may 
help them become less reliant on humans. Some analysts have noted 
the rapid improvement of artificially generated faces using this 
technique.128 Perhaps predictably, this has also been accompanied by 
concerns that it creates ‘alarmingly powerful tools for digital 
fakery’.129 


Commercial availability and cost 


The consultancy firm PricewaterhouseCoopers has recently predicted 
that global GDP will be 14 per cent higher in 2030 as a result of AI, with 
increases in labour productivity and consumer demand.130 One factor 
driving this economic value is the increasing availability to a range of 
actors, outside of technology and academia, of state-of-the-art AI 
software. The MIT Technology Review has named widely available AI as 
one of their ‘breakthrough technologies of 2018,’ highlighting recently 
developed cloud-based offerings from Google (TensorFlow, which is also 
open-source) and Microsoft and Amazon (Gluon).131 These cloud-based 
services are likely to decrease the cost and technical requirements for 
companies wanting to experiment with AI, allowing companies from 
diverse sectors to apply the technology to a variety of different datasets. 


Despite this opportunity, a 2017 McKinsey report suggests that it may be 
some time before AI is widely adopted by enterprise. Over the short term 
the companies found most likely to invest in (and benefit from) AI are 
likely to be large enterprises ‘at the digital frontier’: those in sectors like 
telecommunications, financial services and healthcare who are likely to 
have already implemented data processing technology. Amongst those 
more reluctant to engage, McKinsey found that ‘poor or uncertain 
returns were the primary reason for not adopting,’ suggesting that 
various less technical sectors are likely to need to see a viable use case 
for the technology before they commit.132 


Based on current trends, we believe that while the economy as a whole 
may take some time to work out how best to deploy AI, open tools like 
TensorFlow are likely to encourage researchers and startups to develop 
and experiment with new methodologies in a range of fields. By lowering 
the bar to entry into AI, the increasing availability of these tools is likely to 
drive the next wave of innovation, as well as a new generation of 
methodologies for measuring and targeting populations. 


USE IN POLITICAL CAMPAIGNS 


This section examines current and potential uses of the above 
technologies in political campaigns. Given that many of these 
technologies are still developing, much of this section involves 
speculation on how some of them might be applied in the coming years, 
rather than reflection on how they are used now. We distinguish between 
current and possible uses throughout. Further, while we do not comment 
on the likely effectiveness of these techniques, it should be noted that 
the effectiveness of any campaigning is itself unclear - a recent 
meta-analysis of 49 studies about political campaigns in general found 
that contact from political campaigns had very little impact on voters’ 
choices.133 


Current state of play 


While Donald Trump’s campaign during the 2016 US election received a 
lot of media coverage for its use of big data analytics, similar 
approaches have been used by a number of campaigns in recent years. 
During the EU referendum in the UK, Dominic Cummings estimates that 
Vote Leave ran around one billion targeted adverts in the run up to the 
vote, mostly via Facebook. Like Trump’s campaign, they sent out multiple 
different versions of messages, testing them in an interactive feedback 
loop.134 In the 2017 UK general election, the Labour Party used data 
modelling to identify potential Labour voters, and then target them with 
messages.135 Through the use of an in-house tool called ‘Promote’, which 
combined Facebook information with Labour voter data, senior Labour 
activists were able to send locally based messages to the right (i.e. 
persuadable) people.136 There are a host of similar examples from other 
countries around the world too.137 


Elections are becoming increasingly ‘datafied’, with advertising and 
marketing techniques being offered by a network of private contractors 
and data analysts, offering cutting-edge methods for audience 
segmentation and targeting to political parties all over the world. Many 
of these techniques were first developed in the commercial sector - as 
pointed out by Chester and Montgomery in a 2017 paper, ‘electoral 
politics has now become fully integrated into a growing, global 
commercial digital media and marketing ecosystem that has already 
transformed how corporations market their products and influence 
consumers’.138 


For years, political campaigns have combined their own data on voter 
behaviour with commercial data sets available from data brokers, in 
order to build more detailed profiles of voters. Several firms also offer 
assistance in mining and targeting voters, including so-called ‘marketing 
clouds’ offered by, among others, Adobe, Oracle, Salesforce, Nielsen, 
and IBM.139 These services collect and analyse data about individuals 
from a wide variety of online and offline sources, which can then be 
used to target potential voters, either through ‘ad exchange’ systems or 
directly through large social media platforms. 


UK political parties have begun to invest in data talent, implementing 
data management approaches described above in order to more 
effectively target voters.140 The Labour Party has used NationBuilder, 
while the Liberal Democrats have opted for NGP VAN, a platform 
employed by the U.S. Democratic Party to increase voter turnout. These 
software platforms allow political parties to target individual members of 
a given constituency, allowing for more targeted messaging.141 


It is reasonable to assume that political campaigning will continue to 
evolve, and will adopt many of the state-of-the-art techniques being 
developed in marketing and advertising technology. The allocation of 
political campaign budgets supports this assertion. For the year 2015, the 
first year in which digital spending was reported separately by the 
Electoral Commission, around 23 per cent of the total spend was digital, 
with the majority of this being spent on Facebook.142 


We have identified seven key trends in this area, looking at how data 
analytics is being used in political campaigns already, and how it 
might develop in future. 


Trend 1: Detailed audience segmentation 


Customer segmentation increasingly allows audiences to be divided into 
ever smaller groups on the basis of granular information about their 
demographics, behaviour and attitudes. As Martin Moore and Damian 
Tambini set out in their recent paper on social media power and election 
legitimacy, ‘of particular interest to political strategists and campaigners 
is the fact that data-driven campaigns offer superior targeting and 
audience-segmentation capabilities. Campaigns can get the messages 
they think will be most persuasive to people who are undecided but 
likely to vote, in the constituencies that might swing the election, or key 
voters in a referendum.’143 


The move towards increasing atomisation of voters is reflected in the 
features offered by new commercial products. In 2016, for example, 
global ad giant WPP’s Xaxis system launched ‘Xaxis Politics.’ This claims to 
be capable of ‘reaching US voters across all digital channels’ and to 
‘segment audiences by hundreds of hot button issues as well as by party 
affiliation’, including via ‘real-time campaigns tied to specific real-world 
events’.144 Another example is L2, a data company specialising in data 
enhancement, who offer ‘voter file enhancements’ where ‘all records in 
the national voter file are passed through dozens of processing steps’. 
This includes ‘lifestyle data enhancements’ (including data on income, 
occupation, education, likely marital status, ethnic and religious 
identification, likely primary language, magazine category subscriptions, 
pet ownership and so on) and ‘modelling enhancements’ (including 
‘self-reported views on issues including: gay marriage, gun control and 
immigration’).145 


One important development in targeting is the continued improvement 
of ‘lookalike’ modelling to identify potential supporters and voters. These 
are sometimes called ‘peer groups’ or ‘persuadables’ and allow political 
campaigners to reach new people on Facebook (or elsewhere) who are 
likely to be interested in their party because they are similar to existing 
audiences. The use of lookalike audiences is already commonplace in 
the political world. The Facebook ‘lookalike’ service was used extensively 
by the Trump campaign, and recent research on the Dutch 2017 
national election campaign finds that ‘nearly all campaigns use its 
lookalike audiences function to find new potential voters.’146 Moore and
Tambini raise concerns about the potential uses of this segmentation, 
writing that ‘this profiling procedure may inadvertently result in different 
messages being targeted on the basis of protected characteristics, such 
as ethnic or religious grouping.’147
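

The general idea behind lookalike modelling can be illustrated with a
minimal sketch. It is not Facebook's method, which is not public: it
simply ranks prospects by cosine similarity to the average profile of a
seed audience. The feature vectors, identifiers and function names are
hypothetical.

```python
import math

def centroid(vectors):
    """Average feature vector of the seed audience."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def cosine(a, b):
    """Cosine similarity; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def lookalikes(seed, prospects, top_n):
    """Return the top_n prospect ids most similar to the seed centroid."""
    c = centroid(seed)
    ranked = sorted(prospects, key=lambda pid: cosine(prospects[pid], c),
                    reverse=True)
    return ranked[:top_n]

# Hypothetical features: [interest in politics, local news, sport]
seed = [[0.9, 0.8, 0.1], [0.8, 0.9, 0.2]]
prospects = {
    "p1": [0.85, 0.8, 0.1],
    "p2": [0.1, 0.2, 0.9],
    "p3": [0.7, 0.9, 0.3],
}
result = lookalikes(seed, prospects, 2)
```

Here the two prospects whose interests resemble the seed audience are
selected, while the sport-focused prospect is not. Real systems are
likely to use far more features and more sophisticated models.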


Customer segmentation is becoming more sophisticated, drawing on more
data sources to create ever more granular categories of people, and
promising new insight into user motivation and beliefs.148 It is
reasonable to assume that these new techniques will shortly be
employed in political campaigns, with user data from IoT devices in
particular being combined with machine learning to produce ever more
in-depth segmentation of potential voters.


An important voter demographic to persuade will be ‘influencers’.
Research by Nielsen finds that friends remain the most credible form of 
advertising among consumers, and the enlisting of key influencers - even
at a relatively low level - to promote brand messages is already
commonplace both in marketing (as set out in Section One) and political
campaigns.149 This demand, too, has been recognised by companies
offering campaigning tools. Facebook enables candidates to
deliberately target people who post a lot about politics and (according
to business magazine Fast Company) defines ‘political influencers’ as
‘people who click on political ads, like lots of politics-themed pages, and
share content from political groups.’150


As set out in Section One, AI is set to ‘transform how brands work with
influencers.’151 It is likely that political campaigns will look to harness the
power of AI to improve their engagement with these valuable individuals
and subsequently their reach to potential voters. 


Trend 2: Cross device targeting 


Cross device targeting is a key area in ad-tech, where companies are 
developing increasingly sophisticated ways - both probabilistic and 
deterministic - to gain a ‘user centric’ view of a person, and target them 
across devices. 
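

The difference between deterministic and probabilistic matching can be
sketched as follows. This is an illustration only: the field names,
weights and sample devices are assumptions, not values taken from any
ad-tech vendor. Deterministic matching links devices that share a login
identifier; probabilistic matching scores weaker signals such as a
shared home IP address and overlapping hours of activity.

```python
def deterministic_match(dev_a, dev_b):
    """Devices are linked with certainty when they share a login identifier."""
    return (dev_a["login_hash"] is not None
            and dev_a["login_hash"] == dev_b["login_hash"])

def probabilistic_score(dev_a, dev_b):
    """Heuristic likelihood (0..1) that two devices share a user.

    The 0.6 / 0.4 weights are illustrative assumptions.
    """
    score = 0.0
    if dev_a["home_ip"] == dev_b["home_ip"]:
        score += 0.6
    hours_a, hours_b = set(dev_a["active_hours"]), set(dev_b["active_hours"])
    union = hours_a | hours_b
    if union:
        # Jaccard overlap of the hours in which each device is active
        score += 0.4 * len(hours_a & hours_b) / len(union)
    return score

# Hypothetical devices (IP address taken from a documentation-only range)
phone = {"login_hash": "abc123", "home_ip": "203.0.113.7",
         "active_hours": [7, 8, 22]}
laptop = {"login_hash": "abc123", "home_ip": "203.0.113.7",
          "active_hours": [20, 21, 22]}
tv = {"login_hash": None, "home_ip": "203.0.113.7",
      "active_hours": [20, 21, 22]}
```

In this sketch the phone and laptop are linked deterministically, while
the television, which has no login, can only be linked to them by a
probability score.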


The use of this technology in campaigning is already underway. Martin 
Moore's research on the role of digital marketing in political campaigns 
finds that ‘cross-device targeting is now a standard procedure for 
political initiatives and other campaigns.’ Moore and Tambini confirm 
that while ‘profiling and segmentation has always taken place [in 
political campaigns], rapid innovation makes individual level targeting 
much more efficient and sophisticated.’152 The Democratic National
Committee, for example, worked with ‘data services firm Experian and 
political data company TargetSmart Communications to turn its voter file 
into data that can be used to aim video ads, addressable TV spots and 
mobile and desktop display ads at specific voters.’153


This will increasingly allow campaigns to target individuals at specific 
times, on specific devices, when they may be more receptive to a
message.154 Leading cross-device marketing company Drawbridge
offered a suite of election services in 2016 that provided campaigns with a
number of ways to impact voters, including through ‘Voter-Centric Cross
Device Storytelling’, ‘Political Influencer Identification’, and via
‘Real-Time Voter Attribution Measurement.’155


The direction of travel is towards customer segments of one. Companies 
and marketers are increasingly seeking to target consumers on an 
individual and personalised basis, across their devices, based on widely 
available data about them from multiple sources; a trend which Michael 
Schneider calls a ‘move to a people rather than places approach.’156
One marketing agency, Stirista, offers tailored services to political 
marketers to identify people who are potential supporters and voters. On 
their website, they explain the new status quo of ‘the massive amount of 
offline and online data on voters and donors available today. Even 
newer is the technology to turn millions of records into laser-sharp, multi- 
dimensional, personal profiles, so you can reach each individual with a 
message that will move the needle.’ Stirista claims it has matched 155 
million registered voters to their ‘email addresses, online cookies, and 
social handles’, in addition to their ‘400 segmentation filters’ which 
‘combine demographic, geographical, cultural and interest-based data 
to create the precise profiles you need.’ They also claim their ‘vast vault 
of contact information contains donation history for the past two 
decades, with names and causes.’157


Another development in this field is the application of geo-location data 
to target people individually through social media platforms. This could 
allow parties to identify, for example, people who were present at a 
demonstration or an event. One company, El Toro, claims to be able to 
create lists of devices present at an event, then ‘map those devices 
back to the device’s home physical address, where IP targeting can 
then be implemented’. This allows campaigns to reach voters at home, 
getting a message to those who could not be contacted through
door-to-door canvassing.158 As noted above, this approach is likely to
become more effective through considerable improvements underway in
geo-location based targeting.


Cross-device tracking is part of a larger trend: the increasing prevalence 
of household IoT devices, from voice assistants and fitness trackers to
smart washing machines, is creating novel ways to tap into a ‘captive
audience’ (for example, people judged likely to be trying to complete a
household task that requires a certain product). While we are not aware
of examples of political parties using IoT advertising, it is plausible that
political campaigners, competing to engage potential voters in new 
and innovative ways, will start to look at the possible use of such 
techniques. 


Trend 3: Growth in use of ‘psychographic’ or similar techniques 


There is a long history of personality tests being used for political 
purposes. After the Second World War, for example, concerned 
psychologists in the US developed the ‘F-scale’ in an effort to determine 
who might be susceptible to fascism.159 This psychographic analysis is
now grounded in big data, and many advertising firms offer the ability to 
target consumers (or voters) based on the ‘emotion’ displayed in their 
social media presence. Experian Marketing Services for political 
campaigns offers data that weaves together ‘demographic, 
psychographic and attitudinal attributes’ to target voters digitally. The 
company claims that its data enables campaigns to examine a target’s 
‘heart and mind’ via attributes related to their ‘political persona’ as well 
as ‘attitudes, expectations, behaviours, lifestyles, purchase habits and 
media preferences’.160


Political parties continue to use persuasive communication tactics to 
encourage the electorate to vote for a specific candidate. This has 
included some limited use of specific personality test techniques, such as 
the OCEAN test - although there is no robust evidence on how well this 
approach works. However, the growth of newly available data sets, 
integrated into psychographic modelling, could result in new insights into 
personality types and emotional states, based on correlations derived 
from deep learning systems that are hard for a human analyst to predict, 
but which can be used to inform political advertising. 


If and when these techniques begin to yield improvements in 
commercial settings, it is highly probable that parties and campaign 
groups will continue to integrate them into their strategies. It should
be noted here that, again, there is no guarantee that something
that works in a commercial context translates easily into political
campaigning. Although studies have found some evidence of
effectiveness for psychographic targeting in purchasing choices, there is
at present very little evidence - at least, that we are aware of - as to its
effectiveness in influencing political choices. One unpublished PhD study
has found ‘mixed evidence’ on the ability of personality traits to predict
who is most likely to vote, and raises the question of whether personality 
traits can predict who is most likely to be persuaded by advertisements. 


A voter's mood is just one of the intimate details that technology could
reveal about individuals to political campaigners. As described
above, companies such as L2 already offer detailed data on a voter's 
views on divisive issues like immigration and gun control - meaning 
parties can build up a checklist of precisely what political concerns an 
individual is likely to have. Methods developed by researchers for taking 
measures of a user's ‘satisfaction with life’ from online sources could be 
used, for example, to identify dissatisfied citizens who might be more 
open to messages relating to policy changes.161 By combining social
media posts with heating bills and health data from a fitness tracker, for
example, it might be possible to pick out each user's likely political
frustrations or aspirations, then use that to inform the content of adverts.


Campaigning also presents a potential use case for facial recognition 
technology, though there is no indication that this sort of technology is 
currently being deployed in political campaigns. However, if such 
technology becomes unremarkable through integration into the high 
street, then it is feasible that parties might adopt similar techniques — 
analysing facial expressions of people watching television adverts or 
political debates, and seeking to tweak their messages accordingly. 


Trend 4: Use of AI to target, measure and improve campaigns


Given the trends outlined in the first section of this paper, it is possible that 
AI will prove better than human strategists at working out exactly who
should be targeted, when, and with what content, in order to maximise 
persuasive potential. Such a system would be capable of pulling 
together vast amounts of data from across different sources, and 
identifying relationships likely to remain invisible to human analysts. It is 
not inconceivable that AI-driven platforms might be semi-autonomously
carrying out political campaigns in the near future.162


In addition to enabling better targeting, these technologies could be 
used to monitor and improve the performance of political campaigns. In 
the commercial sector, A/B testing of adverts, which helps producers
understand which messaging (i.e. message A or message B) results in
higher clickthrough rates and more conversions, is commonplace. This
method is used iteratively, constantly seeking to improve and target
messages more effectively throughout a marketing campaign. Improvements in
audience segmentation and cross device targeting mean that specific 
messages can be tested with specific audiences. The problem of finding 
messages that are effective and resonate with potential voters has been 
common to political campaigns for a long time, but previously used to 
involve static and comprehensive ‘focus group’ testing of a narrow 
range of messages, each vetted and signed off by senior politicians.163
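

A single round of A/B testing can be sketched as a comparison of two
clickthrough rates. The example below uses a standard two-proportion
z-test; the function name and the figures are hypothetical, and
commercial platforms may well use different statistical methods.

```python
import math

def ab_test(clicks_a, shown_a, clicks_b, shown_b):
    """Two-proportion z-test on clickthrough rates.

    Returns (z, two-sided p-value). A small p-value suggests the gap
    between message A and message B is unlikely to be due to chance.
    """
    rate_a = clicks_a / shown_a
    rate_b = clicks_b / shown_b
    pooled = (clicks_a + clicks_b) / (shown_a + shown_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / shown_a + 1 / shown_b))
    z = (rate_b - rate_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value

# Made-up figures: message A gets 2.0% clickthrough, message B gets 2.6%
z, p = ab_test(clicks_a=200, shown_a=10_000, clicks_b=260, shown_b=10_000)
```

With these made-up numbers, z is roughly 2.8 and the p-value is below
0.01, so message B would be carried forward to the next iteration.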


The use of social media and commercial advertising techniques has 
brought to this endeavour a step change in pace and scale. Moore and 
Tambini describe a ‘dynamic’ process where ‘messages are selected on 
the basis of their resonance rather than ideological or political selection.’ 
Tools such as Facebook’s Dynamic Creative, using a set of predefined 
design features of an advert, can already construct thousands of 
variations of an advert, present them to users, then find the optimal 
combinations based on engagement metrics. This is the type of feature 
that Brad Parscale, Trump's digital campaign manager, spoke about in a
CBS 60 Minutes interview in which he claimed that his team tested 50,000 to
60,000 ad variations a day.164
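

The combinatorial logic behind this kind of tool can be sketched in a
few lines. This is not Facebook's API or implementation: the component
lists, engagement counts and function name are invented for
illustration.

```python
from itertools import product

# Hypothetical predefined design features of an advert
headlines = ["Lower taxes", "Better schools", "Safer streets"]
images = ["family.jpg", "rally.jpg"]
buttons = ["Learn more", "Donate", "Sign up"]

# Every combination of one headline, one image and one button
variants = list(product(headlines, images, buttons))  # 3 * 2 * 3 = 18

def best_variant(stats):
    """stats maps variant -> (impressions, engagements).

    Returns the variant with the highest engagement rate.
    """
    def rate(variant):
        impressions, engagements = stats[variant]
        return engagements / impressions if impressions else 0.0
    return max(stats, key=rate)

# Made-up engagement counts, for illustration only
stats = {v: (1000, 10 + i) for i, v in enumerate(variants)}
winner = best_variant(stats)
```

The number of variants grows multiplicatively: ten options for each of
four components would already yield 10,000 distinct adverts. This is
offered only as an indication of how such volumes become feasible, not
as a description of how any particular campaign operated.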


Trend 5: Use of artificial intelligence to automatically generate content


One of the most important developments in AI in recent years is the
ability to generate content automatically, which raises the possibility of 
campaigns using programmatically created messaging, developed 
specifically to convince target audiences. In perhaps the most obvious 
use case, Natural Language Generation tools could be used alongside 
algorithmic targeting in order to automatically generate content for 
unique users based on insight about their interests and concerns. In this 
case, instead of finding an optimal combination of design features 
through measuring engagement in the field, a system could use trending 
topics, personal data, and an understanding of the interaction between 
these to generate bespoke and nuanced advertising content. Such 
campaigns could combine the interactive element of chatbots with 
personal data to serve adverts that incorporate a back-and-forth 
interaction, potentially referencing previous interactions or stated 
concerns with newly generated pieces of content. Taken to its logical
conclusion, this could lead to a stream of unique, personalised messages
targeted at each voter and constantly updated based on A/B testing.


This technology has already been applied in
commercial chatbots, which occasionally act as pseudo-shopping
assistants or surreptitiously pretend to be humans on social media. As
conversational technology improves, particularly in verbal machine-to- 
human communication, talking with a chatbot will feel more natural, 
and become more personalised. This familiarity could help chatbots find 
a niche in political communications, employed by campaigns as a 24/7 
helpline for policy questions or as a dynamic cheatsheet to help provide 
quick and informative answers during canvassing. A current example of 
this can be found in the nonpartisan chatbot HelloVote, which was used 
in the 2016 US presidential election. It assisted eligible voters to register to 
vote, checked whether they had registered, and helped them gain the 
ID required in their state. Users could communicate through text and the 
HelloVote system either directly filled out online voter registration forms or 
sent pre-filled forms to the users to submit.165


Not all of this use is likely to be positive. AI-generated content has been
used for misdirection and disinformation, and social media bot accounts
(both private and state-supported) have been used to flood online
spaces with masses of false information with the intention to shut down
conversations.166 Similarly, recently invented technologies that generate
content are likely to be used to mislead and confuse. Technology which
generates photo-realistic images, imitates real voices and hallucinates
faces has clear potential to promote misinformation, providing false
‘proof’ that politicians have said or done something scandalous.167


Trend 6: Using personal data to predict election results 


Political parties conduct polling during election campaigns — to gauge 
overall results, assess reception to policies or leaders, and even identify 
key issues voters are worried about. In recent years campaigns have 
started to use social media in an attempt to discover what people are 
concerned about, and thereby to predict poll results. A recent study by 
Rossini et al found a strong relationship between a candidate's 
performance in public opinion polls and the types of messages shared 
on their social platforms. They conclude that candidates who
polled higher were more active on Facebook and Twitter and were more
likely to use these channels to attack opponents or advocate for
themselves.168


One 2017 study explored the link between political liking behaviour and
actual voting intention, and found that liking politicians’ public 
Facebook posts can be used as an accurate measure for predicting 
voter intention. Interestingly, this research concludes that even a single 
selective Facebook like can reveal as much about political voter 
intention as hundreds of heterogeneous likes.169 Another recent study
looked at tweets from the 2016 United States presidential election, and 
sought to find a correlation between tweet sentiment and the election 
results. After comparing Twitter data to the results of the Electoral
College, the authors found that sentiments expressed online corresponded with
66.7 per cent of the actual outcome of the vote.170
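

One plausible way to arrive at a figure of this kind is an agreement
rate: the share of regions in which the candidate favoured by online
sentiment matches the candidate who actually won. The sketch below is
an assumption about the general approach, not the cited study's method,
and the data are invented.

```python
def agreement_rate(predicted, actual):
    """Fraction of regions where the predicted winner matches the actual one.

    Only regions present in both mappings are compared.
    """
    shared = predicted.keys() & actual.keys()
    if not shared:
        raise ValueError("no regions in common")
    hits = sum(1 for region in shared if predicted[region] == actual[region])
    return hits / len(shared)

# Invented example: sentiment picks the right winner in two of three states
predicted = {"State 1": "A", "State 2": "B", "State 3": "A"}
actual = {"State 1": "A", "State 2": "B", "State 3": "B"}
rate = agreement_rate(predicted, actual)
```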


As described in Section One, approaches such as multimodal analysis are
increasing the accuracy of sentiment analysis technology. We anticipate
increasing use of social media sentiment analysis both to gauge
reception to candidates' speeches or events and as a way to predict
the results of elections themselves - although given the variable
accuracy levels, this will not replace traditional polling in the foreseeable 
future. 


Trend 7: Delivery via new platforms 


Digital video, wearable tech, and VR are all increasingly important forms 
of delivering content to users. This is especially true for video, which, 
viewed on phones and other devices, is considered a highly effective 
way of delivering emotional content on behalf of brands and marketing 
campaigns. YouTube has therefore become an important platform for 
political ads, with the company claiming that today, voters make their 
political decisions not in ‘living rooms’ in front of a television but in what it 
calls ‘micro-moments’ as people watch mobile video.171


The growth of video creates new threats. Late 2017 saw the rise of 
‘deepfakes,’ the application of face-swapping algorithms (among other 
applications) which allow campaigners to ‘put words into’ their 
opponents’ mouths. A recent report on AI and security threats warned
that ‘AI systems may simplify the production of high-quality fake video
footage of, for example, politicians saying appalling (fake) things.’172 This
could mean that ‘in the future AI-enabled high quality forgeries may
challenge the “seeing is believing” aspect of video and audio
evidence.’173 This might not be of immediate concern relating to
personal information - unless of course it involves the use of personal
information of whoever was faked.


Just as companies are starting to use virtual and augmented reality 
platforms to sell products and services, there is also a rise in VR 
campaigns aimed at opening political debate. In a recent project by 
director Alejandro Iñárritu, VR was used to recreate the experience of 
Mexican immigrants crossing into the United States.174 The project is not
alone in its intent to drive social issues forward using these
technologies.175


The increase in ownership of IoT devices - in particular wearables - is
recognised as a new growth area for marketing, especially when 
combined with location data. As this technology is in its early days in the 
commercial sector, there have not yet been any applications to political 
campaigns - at least as far as we are aware. It is, however, an area that 
could be utilised in the future. 


In addition to advertising on social media, more traditional platforms - in 
particular television - are being used in new ways by political campaigns. 
The increase in the collection and use of television data (including set- 
top data and smart TV device data) is facilitating better targeted 
television advertising.176 Political campaigns are at the forefront of this
technology, using ‘second-to-second viewing data’, amplified with
‘demographic and cross-platform data from a multitude of sources’,
acquired via information brokers, to deliver more precise adverts.177 NCC
Media, the US cable TV ad platform owned by Comcast, Cox, and
Spectrum, provides campaigns with the ability to target potential voters
via the integration of its set-top box viewing information with voter data
from Experian and others.178


Due to its enduring importance, targeted TV advertising merits a little 
more explanation here. There are two types of television adverts: 
addressable (where specific individuals are targeted, meaning that two 
people watching the same station at the same time might each see 
different ads) and predictive adverts (everyone in a given market sees 
the same ads, but ads shown during particular shows are singled out by 
political groups based on the high probability that key voters watch
those programs). According to commentators, addressable political
advertising is more commonly used by political campaigns, with the
catch that it is also more costly. As a result, political campaigns are
increasingly merging together various data sources to understand which
programs certain voter segments are most likely to watch.179 In a vivid
illustration of this, Forbes reported recently that services like Deep Root
Analytics helped the Trump campaign identify police procedural ‘NCIS’
as popular with anti-Obamacare voters, while ‘The Walking Dead’ was
preferred by those who favour stricter immigration laws.180


Ken Strasma, CEO of HaystaqDNA, an analytics firm that has worked on
campaigns for Barack Obama and Bernie Sanders, is quoted as saying 
that they haven't yet been able to leverage social media data to 
improve TV targeting: ‘Unfortunately, as it stands now, the number of 
people whose identities we could match and who also had addressable 
TV was too small to make it practical.’181 However, recent patents
describe significant advances in cross device targeting. For example, 
Intel IQ's patent describes the ability to target a television advert based 
on a user profile collated from multiple online devices, only one of which 
needs to be associated with the set-top box (the device that connects a 
television and a signal source). The current limitations of targeted 
political advertising therefore seem unlikely to persist.


KEY CHALLENGES


Below, we set out what we consider to be the main challenges relating 
to political campaigning as a result of new technologies. As the explicit 
focus of this paper is on privacy risks or challenges, we have not 
discussed the many benefits to campaigns and elections that these 
trends might also create. We have also limited ourselves to legal 
methods that legitimate and lawful political parties might adopt - and 
not those from malicious actors. 


Privacy 


There are several privacy risks resulting from the latest developments in 
data exploitation and targeting. Some of these will arise from new 
consumer technologies. A recent report by the Royal Academy of 
Engineering, for example, warned that IoT devices posed a particular
new risk to privacy, as a person's location could be more easily identified
through analysis of data collected through IoT devices.182 The majority of
these risks relate to consumer privacy in general, rather than political 
campaigns specifically. Nevertheless, if political campaign data use 
follows the same route as marketing and advertising there will be a 
greater drive toward both cross-device targeting and individualisation - 
i.e., the aiming of political messaging at unique individuals. As a result, 
campaigns will be incentivised to hold or obtain more personal data on 
individuals, and to collect as much diverse data as possible, in order to 
maximise the effectiveness of their messaging. 


User consent and knowledge 


The more complicated and automated the process of data use and 
targeting becomes, the more difficult it will become for users to ask for a 
clear explanation about how their data is used, and to know whether 
they can ask for it back. As big data and algorithmic technologies are 
often highly complex, and AI-led processes are typically difficult to
scrutinise and explain, the principle of ‘informed consent’ will become 
increasingly difficult to apply even to ‘everyday’ forms of data 
processing. As a result, it will be hard for people, as well as parties, 
regulators and campaign groups, to understand how collected data is 
being used. This will especially be the case with cross-device targeting, 
since it might not always be clear to the user who is actually responsible 
for the targeting.


Inappropriate profiling & messaging


There is potential for automatically generated content to become an 
important part of political campaigning in the coming years. One 
possible scenario involves Natural Language Generation being 
employed to automatically create tailored content for each voter, 
based on data held about them or acquired by parties. This would 
amount to the automation of political campaign content. These 
methods might include a constant and intensive process of iterative A/B 
testing and production, outputting content at a scale and pace beyond 
the capacity of any human editor. This could result in inappropriate, 
inaccurate, or prejudicial adverts appearing. While this form of 
advertising will not necessarily be illegal (although in some cases may
be), its existence could also undermine public confidence in the ability of 
regulators to ensure political campaigns are run fairly and increase 
public distrust of political parties. 


Accountability 


One of the major trends over the next 2-5 years will be the widening 
availability of currently state-of-the-art artificial intelligence, automated 
Natural Language Generation, and user targeting. It is our view that 
algorithmic marketing techniques will soon become available to all 
political parties, enabling them to routinely run thousands, perhaps
millions, of algorithmically tuned messages. This scale might overwhelm
regulators, who could find it difficult to effectively monitor political
adverts that are shown. Regulators are advised to consider how high 
volumes of messages might be stored and shared, potentially with the 
wider public, in such a way that they can be reliably checked. Unless 
this can be achieved, there is a risk of growing concern relating to the 
transparency and political accountability of campaigns. 


Emotional manipulation 


The technique of using psychographics, or broader emotional targeting, 
is likely to improve with the creation of very large and cross-referenced 
datasets. It is probable that, in an effort to increase the inferential power 
of targeting algorithms, personality survey data will be linked with ever 
larger and more varied data sets from IoT devices, location histories and
social media. This will help campaigns build up correlations between
personality types / moods / psychological states and patterns of
behaviour. In advertising, this will lead to new methods for targeting
consumers at the time and place that they're most receptive, using
content that reflects their personality. 


As much of this will necessarily be automated, handing control to an
AI-based system could sometimes result in political parties
targeting people who are extremely depressed, anxious or suffering from 
particular psychological difficulties with adverts designed to appeal to 
them. As well as being ethically questionable, this could cause 
reputational damage to the party running the campaign if discovered - 
though the fragmentation of individual targeting is likely to mean many 
cases go unnoticed. In general, we believe there is a danger that 
political messaging will become more emotional in tone, appealing 
more often to anger, frustration or prejudice, in an attempt to mobilise 
voters and maximise engagement with content. This is likely to have 
other, longer term effects on the health of democracy. 


Competition 


According to various analysts, there is evidence that a small number of 
companies - notably Facebook and Google - are becoming
increasingly dominant in terms of online advertising.183 The continued
improvement of their services to target people - including custom and
lookalike audiences, along with the ability to reach into people’s homes
through advertising by personal assistants - is likely to mean these
companies become increasingly important for online ad spend during 
elections. As the academics Moore and Tambini set out in their recent 
paper on the subject, ‘the continuing dominance and monopoly 
positions, particularly by opaque foreign companies, are likely to be 
particularly corrosive of trust, fairness, and legitimacy.’184


Creation of new forms of personal data 


Likely improvements in generating inferred data - where probabilistic
inferences are made about a user based on an analysis of available
data points - will create new categories of data. For example, on current
trends it might be possible to very accurately determine an individual's
sexual preference, political persuasion, voting history etc, based on a
sophisticated analysis of metadata arising from their device use, 
browsing habits and so on. If this process can be made sufficiently 
accurate, it would allow for the inference of what is arguably personal 
data, generated by a private company without the knowledge of the 
individual concerned. Of particular interest for political campaigns will
be ‘persuadability’ to certain political messages - whether about the
economy, immigration, the environment and so on - which could help to
determine a person’s political persuasions. In these circumstances, it is 
hard to see how the user would know this data exists, or exercise their 
rights to have it removed, corrected, or processed in any other way.